Lookaside buffer for address translation in a computer system

ABSTRACT

A method and apparatus for performing address translation in a computer system supporting virtual memory by searching a translation lookaside buffer (TLB) and, possibly, a translation table held in memory and implemented as a B-tree data structure. The TLB is initially searched for a translation for a specified input address. If exactly one valid entry of the TLB stores a translation for the specified input address then the output address corresponding to the specified input address is determined from the contents of that entry. Otherwise, the translation table is searched for a translation for the specified input address. If two or more valid entries of the TLB store a translation for the specified input address then these entries are invalidated. If a search of the translation table is required then the method involves the retrieval from the translation table, and insertion into the TLB, of a translation for the specified input address and possibly one or more translations for other input addresses that are stored together with the translation for the specified input address in one node of the B-tree implementing the translation table. During the insertion into the TLB of a translation for a particular input address that was retrieved from the translation table it is determined if there is exactly one valid entry in the TLB that stores a translation for the particular input address. If so, then the translation retrieved from the memory is inserted into that entry, thereby avoiding the creation of multiple TLB entries for the same input address.

This application is a continuation of application Ser. No. 08/397,809, filed Mar. 3, 1995, now U.S. Pat. No. 5,680,566.

RELATED APPLICATIONS

The subject matter of this application is related to the subject matter of the following applications:

application Ser. Nos. 08/397,810 and 08/465,166 entitled "PARALLEL ACCESS MICRO-TLB TO SPEED UP ADDRESS TRANSLATION" filed on Mar. 3, 1995 by Chih-Wei David Chang, Kioumars Dawallu, Joel F. Boney, Ming-Ying Li and Jen-Hong Charles Chen;

application Ser. No. 08/388,602 entitled "INSTRUCTION FLOW CONTROL CIRCUIT FOR SUPERSCALER MICROPROCESSOR" filed on Feb. 14, 1995 by Takeshi Kitahara;

application Ser. No. 08/388,389 entitled "ADDRESSING METHOD FOR EXECUTING LOAD INSTRUCTIONS OUT OF ORDER WITH RESPECT TO STORE INSTRUCTIONS" filed on Feb. 14, 1995 by Michael A. Simone and Michael C. Shebanow;

application Ser. No. 08/518,549, a continuation of application Ser. No. 08/388,606 (now abandoned), entitled "METHOD AND APPARATUS FOR EFFICIENTLY WRITING RESULTS TO RENAMED REGISTERS" filed on Feb. 14, 1995 by DeForest W. Tovey, Michael C. Shebanow and John Gmuender;

application Ser. No. 08/516,230, a continuation of application Ser. No. 08/388,364 (now abandoned), entitled "METHOD AND APPARATUS FOR COORDINATING THE USE OF PHYSICAL REGISTERS IN A MICROPROCESSOR" filed on Feb. 14, 1995 by DeForest W. Tovey, Michael C. Shebanow and John Gmuender;

application Ser. No. 08/390,885 entitled "PROCESSOR STRUCTURE AND METHOD FOR TRACKING INSTRUCTION STATUS TO MAINTAIN PRECISE STATE" filed on Feb. 14, 1995 by Gene W. Shen, John Szeto, Niteen A. Patkar and Michael C. Shebanow;

application Ser. No. 08/522,567, a continuation of application Ser. No. 08/397,893 (now abandoned) entitled "RECLAMATION OF PROCESSOR RESOURCES IN A DATA PROCESSOR" filed on Mar. 3, 1995 by Michael C. Shebanow, Gene W. Shen, Ravi Swami and Niteen Patkar;

application Ser. No. 08/523,384, a continuation of application Ser. No. 08/397,891 (now abandoned) entitled "METHOD AND APPARATUS FOR SELECTING INSTRUCTIONS FROM ONES READY TO EXECUTE" filed on Mar. 3, 1995 by Michael C. Shebanow, John Gmuender, Michael A. Simone, John R. F. S. Szeto, Takumi Maruyama and DeForest W. Tovey;

application Ser. No. 08/397,911 entitled "HARDWARE SUPPORT FOR FAST SOFTWARE EMULATION OF UNIMPLEMENTED INSTRUCTIONS" filed on Mar. 3, 1995 by Shalesh Thusoo, Farnad Sajjadian, Jaspal Kohli, and Niteen Patkar;

application Ser. No. 08/398,284 entitled "METHOD AND APPARATUS FOR ACCELERATING CONTROL TRANSFER RETURNS" filed on Mar. 3, 1995 by Akiro Katsuno, Sunil Savkar and Michael C. Shebanow;

application Ser. No. 08/524,294, a continuation of application Ser. No. 08/398,066 (now abandoned) entitled "METHODS FOR UPDATING FETCH PROGRAM COUNTER" filed on Mar. 3, 1995 by Akira Katsuno, Niteen A. Patkar, Sunil Savkar and Michael C. Shebanow;

application Ser. No. 08/398,151 entitled "METHOD AND APPARATUS FOR RAPID EXECUTION OF CONTROL TRANSFER INSTRUCTIONS" filed on Mar. 3, 1995 by Sunil Savkar;

application Ser. No. 08/397,910 entitled "METHOD AND APPARATUS FOR PRIORITIZING AND HANDLING ERRORS IN A COMPUTER SYSTEM" filed on Mar. 3, 1995 by Chih-Wei David Chang, Joel Fredrick Boney and Jaspal Kohli;

application Ser. No. 08/397,800 entitled "METHOD AND APPARATUS FOR GENERATING ZERO BIT STATUS FLAG IN A MICROPROCESSOR" filed on Mar. 3, 1995 by Michael Simone; and

application Ser. No. 08/397,912 entitled "ECC PROTECTED MEMORY ORGANIZATION WITH PIPELINED READ-MODIFY-WRITE ACCESS" filed on Mar. 3, 1995 by Chien Chen and Yizhi Lu;

each of the above applications having the same assignee as the present invention, and each incorporated herein by reference in its entirety.

CROSS REFERENCE TO MICROFICHE APPENDIX

Microfiche Appendix A consists of 8 sheets of microfiche, 495 frames total, submitted under 37 C.F.R. § 1.96 and is a part of this disclosure. Microfiche Appendix A includes source and object code written in AIDA register transfer language specifying a translation lookaside buffer and a table walker in accordance with the present invention. In addition, Appendix A contains a functional specification document for the CAM portion of a translation lookaside buffer in accordance with the present invention.

BACKGROUND OF THE INVENTION

1. Field of The Invention

This disclosure relates to memory management units, in particular memory management units containing a look-aside buffer used to speed up translation in a computer system supporting virtual memory, and more particularly to methods for preventing and recovering from a situation where multiple translations in the buffer correspond to the same address. In addition, this disclosure relates to a particular data structure used to store translations, i.e. a B-tree, that is accessed when a requested translation is not present in the look-aside buffer.

2. Technical Background of the Invention

In computers supporting a virtual memory system, the address space to which programs refer is called "virtual memory" and each virtual address specified by a program instruction is translated by the memory management unit (MMU) to a physical or real address which is passed to the main memory subsystem (hereinafter referred to as "memory") in order to retrieve the accessed item. The use of virtual memory permits the size of programs to greatly exceed the size of the physical memory and provides flexibility in the placement of programs in the physical memory. For various reasons, including the need to keep tables required for address translation to a reasonable size, some virtual to real address translation schemes effect translation in two or more stages.

Usually, each stage of the translation requires one or more accesses to a table that is held in memory. In order to reduce the total number of memory accesses required per address translation, one or more translation-lookaside buffers (TLBs) are often provided in the MMU to reduce the average time required to effect a corresponding number of steps in the address translation scheme. A TLB is a cache-like memory, typically implemented in Static Random Access Memory (SRAM) and/or Content Addressable Memory (CAM), that holds translations corresponding to a particular stage of the translation scheme that have been recently fetched from memory.

Access to a TLB entry holding an output address corresponding to an input address obviates the need for, and is typically many orders of magnitude faster than, access to the in-memory table in order to retrieve the output address corresponding to the input address. (A TLB entry may contain fields describing the translation, in addition to input and output address fields, such as a protection field. Furthermore, one or more fields used to determine the output address, instead of the output address itself, may be stored in the TLB entries.)

If the TLB does not contain the requested translation (i.e. upon a TLB "miss") then the MMU initiates a search of translation tables stored in memory for the requested translation and then loads it into the TLB, where it may be available for subsequent fast access should translation for the same input address be required at some future point. The part of the MMU performing this function, in hardware (logic circuitry), is hereinafter referred to as the "table walker".

Due to errors of various sorts, such as soft errors in RAM, hardware transient errors and software errors, two or more translations for the same input address may appear in the TLB. It would be desirable for the MMU to detect that two or more translations exist in the TLB for the specified input address to be translated and to be able to recover from this anomalous situation by taking appropriate action, such as invalidating the two or more translations and initiating a search by the table walker.

The input address range for the input to a particular stage of the address translation scheme in a computer system supporting virtual memory may be extremely large. For example, in a 64-bit workstation sold by HaL Computer Systems, Inc. (assignee of this disclosure), a 51-bit address is translated in the first stage. A simple array with one entry for each possible input address, as commonly used in the prior art (e.g. a page table), is not a feasible solution, in terms of memory requirements, for implementing the translation table for such a large input address range.

Known data structures to implement the translation table have memory requirements proportional to the total number of possible input addresses rather than the number of input addresses that have been translated, and thus are not practical for very large input address spaces.

SUMMARY

A method for performing address translation in a computer system supporting virtual memory by searching a translation lookaside buffer (TLB) and, possibly, a translation table held in memory and implemented as a B-tree data structure is provided herein. In one embodiment, the size of each node is the cache-line size for the memory.

The TLB is initially searched for a translation for a specified input address. If exactly one valid entry of the TLB stores a translation for the specified input address then the output address corresponding to the specified input address is determined from the contents of that entry. If two or more valid entries of the TLB store a translation for the specified input address then these entries are invalidated. The translation table is searched for a translation for the specified input address if more than one, or none, of the valid entries of the TLB store a translation for the specified input address.

If a search of the translation table is required then the method involves the retrieval from the translation table, and insertion into the TLB, of a translation for the specified input address and possibly one or more translations for other input addresses that are stored together with the translation for the specified input address in one node of the B-tree implementing the translation table. In one embodiment the B-tree implementing the translation table consists of index nodes, which store keys (against which the specified input address is compared) and pointers to other nodes in the translation tree, and leaf nodes, which store translations for one or more input addresses.

During the insertion into the TLB of a translation for a particular input address that was retrieved from the translation table it is determined if there is exactly one valid entry in the TLB that stores a translation for the particular input address. If so, then the translation retrieved from the translation table is inserted into that entry, thereby avoiding the creation of multiple TLB entries for the same input address. Otherwise, the translation retrieved from the translation table is inserted into an entry of the TLB determined by the TLB replacement policy, which in one embodiment is FIFO (first-in first-out).

In one embodiment, control signals whose assertion results in a search of the TLB and a writing into the TLB, respectively, are asserted in the same cycle. Thus, the insertion into the TLB of a translation for a particular address retrieved from the translation table is not delayed by the search of the TLB for a translation for the particular address performed to avoid the creation of duplicate TLB entries.

Advantageously, the present data structure stores translations for similar input addresses in adjacent locations in the memory so that the table walker retrieves and inserts into the TLB not only the translation for the specified input address but also one or more translations for input addresses similar to the specified input address. In this way the time needed to perform the table search is amortized over several translations. Given the locality of reference that most applications programs exhibit, it is likely that translations will be needed in the near future for addresses similar to the specified input address whose translation is currently requested.

However, it is possible that translations for addresses similar to the specified input address that are retrieved together with the translation for the specified address may already be present in the TLB. The present methods avoid creating a duplicate entry in the TLB for any translation that is retrieved from memory by the table walker.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates a translation lookaside buffer and a table walker in accordance with this invention.

FIG. 2 illustrates the circuitry in one embodiment used to invalidate the valid bits of CAM entries.

FIG. 3 depicts the timing for an INSERTION operation performed by the TLB in one embodiment.

FIG. 4 depicts the structure of an index node of a B-tree implementing a translation table in one embodiment.

FIG. 5 depicts the structure of a leaf node of a B-tree implementing a translation table in one embodiment.

FIG. 6 depicts a flowchart illustrating the processing performed by a table walker while searching a translation table implemented as a B-tree, in one embodiment.

FIG. 7 depicts a flowchart illustrating the processing by a table walker of an index node of a translation table implemented as a B-tree, in one embodiment.

FIG. 8 depicts a flowchart illustrating the processing by a table walker of a leaf node of a translation table implemented as a B-tree, in one embodiment.

FIG. 9 depicts a block diagram of a multiple hit detector in one embodiment, consisting of an encoder and a multiple hit checker.

FIG. 10 depicts an encoder contained in a multiple hit detector in one embodiment.

FIG. 11 depicts a multiple hit checker contained in a multiple hit detector in one embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Overview of the translation process

One embodiment of a TLB in accordance with the present invention is depicted in FIG. 1. TLB 101 includes a content addressable memory (CAM) 102 and Static Random Access Memory (SRAM) 103. CAM 102 and SRAM 103 each contain 128 addressable elements (hereinafter referred to as "entries"). Each entry of CAM 102 contains a valid bit field. If the valid bit for a particular entry of CAM 102 is asserted then the entry, together with its corresponding entry in SRAM 103, represents a valid translation. Otherwise, the particular entry is to be ignored. Each entry of CAM 102 also contains an input address field, whose translated address is stored in the corresponding entry of SRAM 103. (In some embodiments, data used to compute the translated address, rather than the translated address itself, is stored in SRAM 103.)
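
For concreteness, the entry organization just described can be modeled in C. The following is a minimal behavioral sketch, not the AIDA RTL of Appendix A; the field names (valid, vaddr, paddr, attributes) and widths are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_ENTRIES 128  /* CAM 102 and SRAM 103 each hold 128 entries */

    /* One entry of CAM 102: a valid bit plus the input address field. */
    typedef struct {
        bool     valid;      /* entry participates in matching only when set */
        uint64_t vaddr;      /* input address */
    } cam_entry_t;

    /* The corresponding entry of SRAM 103: the output address (or, in some
       embodiments, data from which the output address is computed). */
    typedef struct {
        uint64_t paddr;      /* output address or output data */
        uint64_t attributes; /* e.g. protection bits; purely illustrative */
    } sram_entry_t;

    typedef struct {
        cam_entry_t  cam[TLB_ENTRIES];
        sram_entry_t sram[TLB_ENTRIES];
        unsigned     fifo_ptr; /* next entry under the FIFO replacement policy */
    } tlb_t;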

A signal on line 104, indicating an input address (X) whose translation is desired (hereinafter the "specified address"), is received by TLB 101. The contents of each entry of CAM 102 are compared with the specified input address. This task, which is referred to as a MATCH operation, is described below in further detail.

If exactly one valid entry of CAM 102 stores the specified input address then the output address representing the translation of the specified input address is determined from the contents of the corresponding entry of SRAM 103. If two or more valid entries of CAM 102 store the specified input address then these entries are invalidated via an INVALIDATE operation, which is described in more detail below.

A translation table 116 (which in one embodiment is implemented by a B-tree data structure) stored in memory 106 is searched by table walker 115 for a translation for the specified input address if more than one, or none, of the valid entries of CAM 102 store the specified input address. Table walker 115, as discussed below in more detail, may retrieve from translation table 116 not only a translation for the specified input address but possibly one or more translations for other input addresses.

Each translation retrieved by table walker 115 is inserted into an entry of CAM 102 and the corresponding entry of SRAM 103 via an INSERTION operation, which is described in more detail below. In order to avoid creating a duplicate translation when inserting into TLB 101 a translation for a particular input address retrieved from translation table 116, a MATCH operation is performed in order to determine if a translation for that particular address is already stored in an entry of TLB 101. If exactly one entry of TLB 101 stores a translation for that particular address then the translation for that particular address that was retrieved by table walker 115 from translation table 116 is inserted into that entry, thereby avoiding the creation of multiple TLB entries for the particular address. Otherwise, the TLB entry into which the translation retrieved by table walker 115 is inserted is determined by the replacement policy for TLB 101.
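
The overall flow just described reduces to the following C sketch, continuing the structures above. Real hardware compares all 128 CAM entries in parallel; the loop here is a sequential stand-in, and table_walk_tree() is a hypothetical name for the table walker entry point sketched later with FIGS. 6-8.

    void table_walk_tree(tlb_t *t, uint64_t x); /* table walker 115; sketched below */

    /* Return the number of valid CAM entries matching x; *idx receives the
       index of a matching entry when at least one exists. */
    static int tlb_match(const tlb_t *t, uint64_t x, unsigned *idx)
    {
        int hits = 0;
        for (unsigned i = 0; i < TLB_ENTRIES; i++) {
            if (t->cam[i].valid && t->cam[i].vaddr == x) {
                *idx = i;
                hits++;
            }
        }
        return hits;
    }

    bool translate(tlb_t *t, uint64_t x, uint64_t *out)
    {
        unsigned i;
        int hits = tlb_match(t, x, &i);

        if (hits == 1) {                 /* unique valid entry: fast path */
            *out = t->sram[i].paddr;
            return true;
        }
        if (hits > 1) {                  /* anomalous duplicates: INVALIDATE */
            for (unsigned j = 0; j < TLB_ENTRIES; j++)
                if (t->cam[j].valid && t->cam[j].vaddr == x)
                    t->cam[j].valid = false;
        }
        table_walk_tree(t, x);           /* search translation table 116 */
        if (tlb_match(t, x, &i) == 1) {  /* retry after the walker's insertions */
            *out = t->sram[i].paddr;
            return true;
        }
        return false;                    /* walker raised an interrupt */
    }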

Operations supported by CAM 102

TLB 101 supports several operations, including MATCH, WRITE, INSERTION and INVALIDATE. Each of these four operations is described below.

TLB MATCH operation

A MATCH operation is performed in two situations. In the first situation, a MATCH operation is performed when TLB 101 receives on line 104 a signal indicating an input address whose translation is desired. In this situation MUX 107 selects the signal on line 104. If the MATCH operation in this first situation finds exactly one valid matching entry of CAM 102 then the desired output address is retrieved from the corresponding entry of SRAM 103. Otherwise, table walker 115 must search translation table 116, which is stored in memory 106, for a translation for the input address indicated by the signal on line 104.

In the second situation, a MATCH operation is performed when TLB 101 receives on line 105 a signal indicating an input address whose translation has been retrieved from translation table 116 by table walker 115. In this situation MUX 107 selects the signal on line 105.

When the signal on line 121 received by CAM match enable port 122 is asserted, a MATCH operation occurs during which the input address field of every entry of CAM 102 is simultaneously compared with the input address (hereinafter "X") indicated by the signal on line 119 that is received by CAM match port 109.

Match lines 112, MATCH[127:0] (one for each entry of CAM 102), are coupled to CAM 102 and a multiple hit detector (MHD) 150. If the input address field of the ith entry of CAM 102, CAM(i), is equal to ("matches") X, and the valid bit of CAM(i) is asserted, then the signal on the ith match line, MATCH(i), is asserted. In some embodiments, a process ID, corresponding to the process that generated the virtual address currently being translated by the MMU, is also supplied to CAM 102 and must be matched, in addition to the input address, to a corresponding field in an entry of CAM 102. Such embodiments may be used where the translation step corresponding to CAM 102 is process-ID-dependent, in order to avoid having to invalidate the entire contents of CAM 102 upon a context switch, i.e. when a different process begins to execute.

MHD 150 determines whether zero, one or more of the signals on match lines 112 are asserted and sets two signals, MHIT and HIT, which are transmitted on lines 113 and 114, respectively. If the signal on exactly one of match lines 112 is asserted then MHD 150 asserts the HIT signal. MHD 150 asserts the MHIT signal if the signals on more than one, or none, of match lines 112 are asserted.

In one embodiment, MHD 150 has the structure depicted in FIG. 9. Encoder 901 encodes the 128 binary signals carried on match lines 112 into two 7-bit signals, HAD[6:0] and HADN[6:0], carried on lines 903 and 904, respectively. Multiple hit checker 902 asserts the HIT signal if HAD and HADN are bit-wise complements of each other. Otherwise, the multiple hit checker asserts the MHIT signal. In one embodiment encoder 901 and multiple hit checker 902 are the circuits depicted in FIGS. 10 and 11, respectively.

In FIG. 10, each of the signals on match lines 112, i.e. MATCH[127:0], is associated with two respective sets of 7 MOSFET transistors. The locations of zeros in the 7-bit binary representation for i, where i is between 0 and 127 inclusive, determine the members of the first set of 7 MOSFETs associated with MATCH[i] whose gates are connected to MATCH[i]. For example, if i=1 the binary representation for i is 0000001. Thus, MATCH[i] is connected to the gates of MOSFETs 1001, 1002, 1003, 1004, 1005 and 1006 but not 1007.

The locations of ones in the 7-bit binary representation for i, where i is between 0 and 127 inclusive, determine the members of the second set of 7 MOSFETs associated with MATCH[i] whose gates are connected to MATCH[i]. For example, if i=1 the binary representation for i is 0000001. Thus, MATCH[i] is connected to the gate of MOSFET 1014 but not to the gates of MOSFETs 1008, 1009, 1010, 1011, 1012 and 1013.
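
Read behaviorally, encoder 901 ORs the 7-bit index of every asserted match line into HAD and the bit-wise complement of that index into HADN, so HAD and HADN are exact complements only when exactly one line is asserted. A C model of MHD 150 under that reading of FIGS. 10 and 11 follows; the array argument stands in for MATCH[127:0].

    /* Behavioral model of MHD 150 (encoder 901 plus multiple hit checker 902). */
    void mhd(const bool match[128], bool *hit, bool *mhit)
    {
        uint8_t had = 0, hadn = 0;            /* HAD[6:0] and HADN[6:0] */

        for (unsigned i = 0; i < 128; i++) {
            if (match[i]) {
                had  |= (uint8_t)i;           /* ones of the index i  */
                hadn |= (uint8_t)~i & 0x7F;   /* zeros of the index i */
            }
        }
        /* Exactly one line asserted iff HAD and HADN are bit-wise complements;
           zero matches give 0/0, and multiple matches make the two overlap. */
        *hit  = ((had ^ hadn) & 0x7F) == 0x7F;
        *mhit = !*hit;
    }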

A table walker 115 (see FIG. 1) receives the MHIT and HIT signals on lines 113 and 114, respectively. In the case of no or multiple matching entries in CAM 102 upon a MATCH operation for an input address received on line 104, table walker 115, upon detecting the assertion of the MHIT signal on line 113, searches translation table 116 (held in memory 106) for a translation for input address X. This search by table walker 115 is described in greater detail below. On the other hand, if exactly one of match lines 112, MATCH(i), is asserted as a result of the MATCH operation then the above-mentioned table walker search is not initiated and the output address representing the translation of input address X is retrieved from the ith entry of SRAM 103, SRAM(i).

The timing of a MATCH operation is as follows. In the first cycle, the signal on line 108 indicates an input address X to be matched. Also, in the first cycle, the match enable signal on line 117 is asserted. In the second cycle, the signal on line 108 is latched by latch 118 and thus the signal on line 119, which is received at CAM match port 109, indicates input address X. Also, in the second cycle, the match enable signal on line 117 is latched by latch 120 and thus the signal on line 121, which is received by CAM match enable port 122, is asserted.

The assertion of the signals transmitted on zero, one or more match lines 112 (corresponding to matching entries of CAM 102) and the setting by MHD 150 of signals MHIT and HIT, transmitted on lines 113 and 114, respectively, are achieved within the first half of the second cycle of the MATCH operation.

If the signal on exactly one of match lines 112, MATCH(i), is asserted during the first half of the second cycle of a MATCH operation then the signals on match lines 112 are selected by MUX 123 due to the assertion of the HIT signal. Exactly one of the 128 output lines 173 of MUX 123 is asserted by the end of the first half of the second cycle. Each output line 173 is connected to an input line of a respective one of 128 AND gates 171. TLB word select lines 124 are the respective output lines of AND gates 171. The clock signal is provided to an inverted input of each of the 128 AND gates 171. Thus, during the second half of the second cycle, exactly one of the signals transmitted on TLB word select lines 124, WL(i), is asserted.

If the MATCH operation was performed for an input address supplied on line 104 then the contents of CAM(i) and SRAM(i) are made available at CAM read port 125 and SRAM read port 126, respectively. The contents of SRAM(i) contain the desired output address corresponding to the input address X. In some embodiments, a field in SRAM(i) may be used to perform a parity check on the contents of CAM(i). If, on the other hand, the MATCH operation was performed for an input address supplied on line 105 then the data indicated by signals 155 and 132 (i.e. a translation retrieved from translation table 116 by table walker 115) are written into CAM(i) and SRAM(i), respectively.

TLB INVALIDATE Operation

In the case of multiple matching entries in CAM 102 upon a MATCH operation for an input address received on line 104, logic in TLB 101 deasserts the valid bits of the matching entries. In one embodiment, the logic circuitry deasserting valid bits upon the detection of multiple matching entries is as shown in FIG. 2. The valid bit of each entry of CAM 102 is implemented by a standard 6-transistor RAM cell. For example, the value of the valid bit for CAM(0) is stored in RAM cell 201 as the signal on line 202.

The deassertion of the valid bit for CAM(0) upon the occurrence of multiple matching entries (including CAM(0)) is achieved as follows. Upon the occurrence of multiple matching entries, more than one of the signals on match lines 112, including MATCH[0] carried on input line 204 of AND gate 203, are asserted. In response, MHD 150 asserts the MHIT signal on line 113. As a result of the assertion of the MHIT signal, the signal INV_MATCH on input line 205 of AND gate 203 is asserted. Thus, the signal on output line 206 of AND gate 203 becomes asserted. Output line 206 is coupled to the gate of MOSFET 208. The source of MOSFET 208 is connected to line 202 and the drain of MOSFET 208 is connected to a ground source 207. The assertion of the signal on line 206 turns MOSFET 208 on and thus the signal on line 202 is tied to ground, thereby deasserting the valid bit for CAM(0). The deassertion of the valid bits of other entries of CAM 102 is similarly achieved.

TLB WRITE Operation

In a WRITE operation, the data indicated by the signals on lines 155 and 132 is written into an entry of CAM 102 and the corresponding entry of SRAM 103, respectively. The WRITE operation is performed in two situations.

In the first situation, the WRITE operation is specified by an instruction executed by the CPU. In this situation, the CPU supplies a signal on line 138 indicating the element of CAM 102 and SRAM 103 into which the write is to occur, a signal on line 151 indicating the data to be written into CAM 102 and a signal on line 157 indicating data to be written into SRAM 103. MUXes 123, 152 and 156 select the signals on lines 140, 151 and 157, respectively.

In the second situation, the data to be written into CAM 102 and SRAM 103 is retrieved from translation table 116 and supplied by table walker 115 on lines 105 and 129, respectively. A signal on line 138 indicates the element of CAM 102 and SRAM 103 into which the write is to occur and is set according to the TLB replacement policy. MUXes 123, 152 and 156 select the signals on lines 140, 105 and 129, respectively.

The timing of a WRITE operation is as follows. In the first cycle, a write enable signal on line 130 is asserted. In the second cycle, the write enable signal on line 130 is latched by latch 134 and is thus received by CAM write enable port 136 and SRAM write enable port 137. Also, during the second cycle the outputs of MUXes 152 and 156 are latched by latches 153 and 131, respectively, and are thus received by CAM write port 110 and SRAM write port 133, respectively. During the first half of the second cycle, the address indicated by the signal on line 138 is decoded by address decoder 139. By the beginning of the second half of the second cycle one of address decoder lines 140 is asserted and the corresponding output line of MUX 123 is asserted. During the second half of the second cycle the corresponding word select line 124 is asserted (this assertion is triggered by the falling edge of the clock signal, which is connected to an inverted input of each of the 128 AND gates 171) and the data at CAM write port 110 and SRAM write port 133 is written into the entry of CAM 102 and corresponding entry of SRAM 103, respectively, corresponding to the asserted word select line 124.

TLB INSERTION Operation

As discussed above, if a unique valid translation for input address X, indicated by the signal on line 104, is not present in TLB 101, then table walker 115 searches for the desired translation in translation table 116, a data structure held in memory 106. As described in more detail below, table walker 115 might retrieve several unrequested translations as a result of this search. Each of the translations retrieved from memory 106 by table walker 115 is entered into TLB 101 via an INSERTION operation. There is a possibility that a retrieved but unrequested translation will already exist in TLB 101. The INSERTION operation is designed to avoid creating multiple entries for the same input address, without incurring an increase in the time required to insert those retrieved translations that are not already present in TLB 101.

The timing of an INSERTION operation in one embodiment is as illustrated in FIG. 3. In the first cycle, memory 106 generates signals on lines 127A and 127B which indicate an input address Y to be inserted into an entry of CAM 102 and a corresponding output address (or, in some embodiments, output data from which an output address is computed) to be inserted into a corresponding entry of SRAM 103, respectively. Together, the signals on lines 127A and 127B constitute a translation to be inserted into TLB 101. (In the embodiment defined by the RTL code attached in Appendix A, the memory can only transfer eight bytes per cycle and thus the signal representing the CAM portion of the translation (8 bytes) is transferred in one cycle and the signal representing the SRAM portion of the translation is transferred over the following two cycles.)

In the second cycle, buffers 128A and 128B contained in table walker 115 latch the signals on lines 127A and 127B, respectively, thereby generating two signals on lines 105 and 129. When TLB 101 performs an INSERTION operation, MUXes 107 and 152 select the signal on line 105, and as a result the signals on lines 108 and 154 also indicate input address Y during the second cycle. When TLB 101 performs an INSERTION operation, MUX 156 selects the signal on line 129, and as a result the signal on line 158 also indicates an output address corresponding to input address Y during the second cycle.

Also in the second cycle, the match enable and write enable signals on lines 117 and 130, respectively, are asserted. Thus, an INSERTION operation is a combination of a MATCH operation and a WRITE operation. As will be discussed below in more detail, the initiation of the MATCH and WRITE operations in the same cycle (as opposed to performing the MATCH operation first and then deciding whether or not to do a WRITE operation on the basis of the results of the MATCH operation) by the assertion of the match enable and write enable signals on lines 117 and 130, respectively, avoids an increase in the time required to perform an INSERTION operation.

In the third cycle, latches 118 and 153 latch the signals on lines 108 and 154, respectively, and thus the signals on lines 119 and 155, received by CAM match port 109 and CAM write port 110, respectively, indicate input address Y. Also, in the third cycle, latch 131 latches the signal on line 158 and thus the signal on line 132, received by SRAM write port 133, indicates an output address corresponding to input address Y. In the first half of the third cycle, the match enable signal on line 117 is latched by latch 120 and thus the signal on line 121, which is received by CAM match enable port 122, is asserted. As a result of the assertion of the signal on line 121 the following occur during the first half of the third cycle, as per an ordinary MATCH operation: 1) the input address, Y, is matched against the input address fields of every entry of CAM 102; 2) zero, one or more of the signals transmitted on match lines 112 are asserted; 3) multiple hit detector 150 determines if the input address Y has matched zero, one or multiple valid entries in CAM 102; and 4) MHD 150 sets the MHIT and HIT signals on lines 113 and 114, respectively.

The address signals on lines 138 are decoded by address decoder 139 in the first half of the third cycle and indicate which entry of TLB 101 will be used to insert the translation from memory unless there is already exactly one entry in CAM 102 matching input address Y. The replacement policy for TLB 101 determines the way in which the address signals on lines 138 are set during an INSERTION operation. In one embodiment, wherein FIFO (first-in, first-out) replacement is employed, the address signals on lines 138 are initially set to point to CAM(0). After each INSERTION operation, the address signals on lines 138 are changed to point to CAM((i+1) mod 128) when they currently point to CAM(i), unless exactly one CAM entry matches the input address to be inserted into CAM 102, in which case the address signals on lines 138 are not changed.
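
Functionally, the net effect of an INSERTION (the dedup match and the write launched in the same cycle, with HIT steering MUX 123) is captured by this sequential C sketch, continuing the earlier structures; in hardware the two operations are overlapped rather than executed one after the other.

    /* Insert one translation fetched by table walker 115 into TLB 101. */
    void tlb_insert(tlb_t *t, uint64_t vaddr, uint64_t paddr)
    {
        unsigned i, slot;
        int hits = tlb_match(t, vaddr, &i);

        /* Exactly one existing match: overwrite it (match lines 112 drive the
           word select lines). Otherwise: use the entry named by the FIFO
           pointer (address decoder lines 140 are selected instead). */
        slot = (hits == 1) ? i : t->fifo_ptr;

        t->cam[slot].valid  = true;
        t->cam[slot].vaddr  = vaddr;
        t->sram[slot].paddr = paddr;

        if (hits != 1) /* a fresh entry was consumed: advance to CAM((i+1) mod 128) */
            t->fifo_ptr = (t->fifo_ptr + 1) % TLB_ENTRIES;
    }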

If MHD 150 determines that input address Y matches multiple or no entries of CAM 102 and therefore MHD 150 deasserts the HIT signal on line 114 (which drives the select line of MUX 123) in the first half of the third cycle, then the signals on address decoder lines 140 are selected by MUX 123 and the CAM word select line 124 corresponding to the asserted address decoder line 140 is asserted in the second half of the third cycle (this assertion is triggered by the falling edge of the clock signal, which is connected to an inverted input of each of the 128 AND gates 171).

Otherwise, i.e. in the case of a single entry of CAM 102 matching input address Y, MHD 150 asserts the HIT signal on line 114 in the first half of the third cycle and as a result match lines 112 are selected by MUX 123. During the second half of the third cycle, the word select line 124 corresponding to the asserted match line 112 is asserted (this assertion is triggered by the falling edge of the clock signal, which is connected to an inverted input of each of the 128 AND gates 171).

Also, in the third cycle, the write enable signal on line 130 is latched by latch 134 and thus the signal on line 135, which is received by CAM write enable port 136 and SRAM write enable port 137, is asserted. Thus, during the third cycle the translation data indicated by the signals on lines 155 (i.e. input address) and 132 (output address) is written into the entry of CAM 102, and the corresponding entry of SRAM 103, selected by TLB word select lines 124.

The above INSERTION operation avoids the creation of multiple matching entries in TLB 101 in the case of a single matching entry already present in TLB 101 by driving TLB word select lines 124 with match lines 112, thereby overwriting the matching entry with the translation fetched from memory 106 by table walker 115, as opposed to entering the fetched translation into another TLB entry, i.e. the one indicated by the address signals on lines 138. On the other hand, the matching that is performed during an INSERTION to prevent the creation of multiple matching entries does not increase the time required to insert the translation data fetched from memory 106 into the TLB entry indicated by the address signals on lines 138 in the case of no or multiple matching CAM entries.

This is because the write and match operations are started simultaneously by the assertion of the match enable and write enable signals on lines 117 and 130, respectively, in the same cycle, and because the decoding of the address signals on lines 138, indicating the entry of TLB 101 into which the translation to be inserted will be placed if there is not exactly one matching entry in TLB 101, occurs simultaneously (during the first half of the third cycle) with the matching of the input address associated with the translation against the entries of CAM 102.

If memory 106 can supply table walker 115 with one translation per cycle, then the above-described INSERTION operation can support an effective insertion rate of approximately one translation per cycle, since the translations can be pipelined along lines 127A, 105/154 and 155 (for writing into CAM 102), along lines 127A, 105/108 and 119 (for matching with CAM 102) and along lines 127B, 129/158 and 132 (for insertion into SRAM 103). In addition, buffers 128A and 128B in table walker 115 only need to be large enough to hold one translation, since table walker 115 can send out translations on lines 105 and 129 as fast as table walker 115 receives translations from memory 106.

As discussed above, the actual matching and writing in an INSERTION operation for TLB 101 occur in the same cycle. On the other hand, a TLB that performs insertions by first matching and then writing in a subsequent cycle (depending on the results of the matching) could not support an effective rate of insertion exceeding one translation per two cycles, assuming the TLB is incapable of matching a second input address while writing a first input address that was matched in the previous cycle. Also, buffer space sufficient to store N/2 translations would be required in such a TLB, where N is the maximum number of translations delivered by memory during any search of the translation table, if translations are delivered from the memory to the table walker at the rate of one translation per cycle.

In addition, an INSERTION operation designed to perform a match before initiating the writing of the fetched translation suffers from another disadvantage. Since the maximum effective insertion rate would be one translation per two cycles, table walker 115 could only send out one translation on lines 105 and 129 every two cycles. Thus, the table walker would require a buffer that can store approximately N/2 translations, where N is the maximum number of translations delivered by memory 106 in consecutive cycles.

Fetching Translation Data upon a TLB Miss

When the translation for a particular input address, X, is not present in TLB 101, table walker 115 searches a data structure held in memory 106 (translation table 116), created and maintained by the operating system, that contains translations for the currently running process. In one embodiment translation table 116, hereinafter referred to as the "translation tree", is implemented as a B-tree. B-trees are well known data structures and are defined and described at pages 305-327 of "Design of Database Structures" by Teorey and Fry (1982, Prentice Hall Inc.), hereby incorporated by reference. As described therein, there are several variants of a B-tree. The term B-tree is used herein to denote any of these and perhaps other variations of the conventional B-tree. In the embodiment described immediately below, translation table 116 is implemented by the variant of the B-tree known as a B*-tree.

Each node of a B*-tree is either a leaf node or an index node. In one embodiment, each translation tree node is of size equal to the memory cache line size of 128 bytes and each index node of the translation tree is of the form depicted in FIG. 4. Index node 400 consists of alternating node pointers 401 and keys 402. Each of the eight node pointers 401 and seven keys 402 is 8 bytes long. Each key 402 stores an input address which may be compared against X, the input address to be translated. Up to 6 of the last keys in any index node may be nil. The node pointer following a nil key is invalid. The values of the non-nil keys in an index node increase in value from left to right. The last 8 bytes of every index node are ignored.

Each node pointer 401 contains the following fields: a node address field, which points to another node of the translation tree; a node address type field, which defines the type of address stored in the node address field (see the description below of some possible types of node address); a node type field, which indicates the type of node (leaf or index) pointed to by the address in the node address field; and a translation number field, which indicates the number of valid translations stored in the leaf node pointed to by the address in the node address field (only applicable where the node type field is "leaf").

As mentioned above, the address stored in the node address field of a node pointer 401 can take several forms. In various embodiments, the following, as well as possibly other, node address types may be provided: 1) real: The node address represents a real or physical memory address which can be directly used to retrieve from memory the translation tree node pointed to.

2) logical: The node address represents a logical address of some sort, which of course must itself be translated to a real address before the translation tree node pointed to can be retrieved from memory. In some embodiments, this translation is achieved quite speedily. For example, the node address could represent a page offset within the page in which node pointer 401 resides, in which case the corresponding real address is formed merely by concatenating the real page address of the page in which node pointer 401 resides with the node address stored in node pointer 401. In another embodiment, there might be provided in the MMU a special translation look-aside buffer to translate a logical address stored in the node address field of node pointer 401 into a real address.
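
The node pointer and index node formats can be sketched in C as follows, continuing the earlier sketches. The text gives byte counts but not bit positions within a node pointer, so the unpacked layout below is an assumption for readability; in memory each pointer and key occupies 8 bytes, alternating as in FIG. 4.

    enum addr_type { ADDR_REAL, ADDR_LOGICAL }; /* node address type field */
    enum node_type { NODE_INDEX, NODE_LEAF };   /* node type field         */

    /* Fields of a node pointer 401 (packing is illustrative). */
    typedef struct {
        uint64_t       node_addr; /* node address field                          */
        enum addr_type addr_type; /* real or logical                             */
        enum node_type node_type; /* type of the node pointed to                 */
        unsigned       num_trans; /* valid translations, when pointing to a leaf */
    } node_ptr_t;

    /* 128-byte index node 400: eight node pointers and seven keys; the
       last 8 bytes are ignored. */
    typedef struct {
        node_ptr_t ptr[8];
        uint64_t   key[7]; /* non-nil keys increase from left to right */
    } index_node_t;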

In one embodiment, a leaf node of the translation tree is of the form depicted in FIG. 5. Leaf node 500 occupies 128 bytes (the assumed cache line size of the memory in this embodiment) and stores 5 translation descriptors 501, each occupying 24 bytes. The last 8 bytes of a leaf node are ignored. Each translation descriptor 501 consists of a CAM data portion 502, occupying 8 bytes, and an SRAM data portion 503, occupying 16 bytes. CAM data portion 502 and SRAM data portion 503 contain translation data to be inserted into an entry of CAM 102 (e.g. an input address) and the corresponding entry of SRAM 103 (e.g. an output address), respectively. In addition, CAM data portion 502 contains a valid bit which indicates whether or not its associated translation descriptor is valid.
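
A leaf node reduces, under the same assumptions, to five 24-byte translation descriptors plus 8 ignored bytes; the exact bit placement of the valid bit within CAM data portion 502 is not specified here, so the split below is illustrative.

    /* One 24-byte translation descriptor 501. */
    typedef struct {
        bool     valid;        /* valid bit carried in CAM data portion 502 */
        uint64_t vaddr;        /* input address (CAM data portion 502)      */
        uint64_t sram_data[2]; /* 16-byte SRAM data portion 503             */
    } trans_desc_t;

    /* 128-byte leaf node 500: five descriptors, last 8 bytes ignored. */
    typedef struct {
        trans_desc_t desc[5];
    } leaf_node_t;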

A special register in the MMU is used to hold a node pointer, hereinafter referred to as the "TTRP" (translation tree root pointer), that has the same format as that of the node pointers stored in index nodes of the translation tree, as described above, and whose node address field contains an address pointing to the root node of the translation tree for the currently executing process. Upon a context switch (i.e. when a different process starts executing), the operating system updates the contents of the TTRP register to point to the root node of the translation tree for the newly executing process.

The steps involved in a search of the translation tree by table walker 115 are illustrated by flowchart 600 of FIG. 6. Processing by table walker 115 starts in step 601, where the variable current_node_ptr, stored in memory, is initialized to the node pointer stored in the TTRP register. Throughout the processing depicted by FIG. 6, the node address field of current_node_ptr points to the currently searched translation tree node. Processing transfers from step 601 to decision step 602.

In decision step 602 table walker 115 examines the node type field of current_node_ptr. If the type field indicates "leaf" then processing transfers from step 602 to step 603. In step 603 table walker 115 processes the leaf node pointed to by the node address field of current_node_ptr. From step 603 processing transfers to step 604, where processing terminates.

If table walker 115 determines in step 602 that the node type field of current_node_ptr is "index" then processing transfers from step 602 to step 605. In step 605 table walker 115 processes the index node pointed to by the node address field of current_node_ptr. By the end of step 605, current_node_ptr holds a pointer that was stored in the translation tree node just searched and whose node address field points to the next translation tree node to be processed by table walker 115. From step 605 processing transfers to step 602.
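
The loop of flowchart 600 reduces to a few lines of C, continuing the sketches above. read_ttrp() stands in for reading the TTRP register, and walk_index() and walk_leaf() for the per-node processing of FIGS. 7 and 8, sketched below; all three names are hypothetical.

    node_ptr_t read_ttrp(void);                                 /* TTRP register */
    node_ptr_t walk_index(node_ptr_t cur, uint64_t x);          /* FIG. 7, below */
    void       walk_leaf(tlb_t *t, node_ptr_t cur, uint64_t x); /* FIG. 8, below */

    /* Search of the translation tree per flowchart 600 of FIG. 6. */
    void table_walk_tree(tlb_t *t, uint64_t x)
    {
        node_ptr_t cur = read_ttrp();       /* step 601 */
        while (cur.node_type == NODE_INDEX) /* step 602 */
            cur = walk_index(cur, x);       /* step 605 */
        walk_leaf(t, cur, x);               /* steps 603-604 */
    }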

Step 605 of FIG. 6 for the processing of an index node of the translation tree by table walker 115 is illustrated in more detail by flowchart 700 of FIG. 7. Processing begins in step 701, in which table walker 115 requests the memory cache line starting at the real address stored in (or corresponding to, in the case of a logical node address) the node address field of current_node_ptr. Memory 106 takes several cycles to set up the read and then sends, in a sequence of consecutive cycles, a fixed number (8, in the embodiment represented by the RTL code of Appendix A) of bytes of the requested cache line to table walker 115 until all 128 bytes of the cache line have been received. Processing transfers from step 701 to step 702, during which table walker 115 stores the first 8 bytes (a node pointer) returned from memory 106 into current_node_ptr. Also, during step 702, a counter variable, i, is initialized to 1. The purpose of counter variable i is to ensure that the loop defined by processing steps 703-705, described below, is not performed more than 7 times.

Processing transfers from step 702 to decision step 703, in which counter variable i is compared to 7. If i is greater than 7 then processing transfers to step 707, where processing terminates. If i is not greater than 7 then processing transfers to step 704, where table walker 115 compares the key contained in the next 8 bytes received from the memory with X, the input address whose translation is desired. If X is less than the key then processing transfers to step 707, where processing terminates. On the other hand, if X is not less than the key then processing transfers to step 705, where table walker 115 stores the next 8 bytes received from the memory (a node pointer) into current_node_ptr. Also in step 705, counter variable i is incremented by 1. From step 705 processing transfers to step 703.
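
Flowchart 700 corresponds to the sketch below. fetch_line() is a hypothetical helper performing the cache-line read of step 701 (including any logical-to-real conversion of the node address); nil-key handling is omitted, since the flowchart bounds the loop only by the key comparison and the 7-iteration limit.

    void fetch_line(uint64_t node_addr, enum addr_type type, void *buf);

    /* Process one index node per flowchart 700 of FIG. 7; returns the node
       pointer to follow next. */
    node_ptr_t walk_index(node_ptr_t cur, uint64_t x)
    {
        index_node_t node;
        fetch_line(cur.node_addr, cur.addr_type, &node); /* step 701 */

        node_ptr_t next = node.ptr[0];                   /* step 702 */
        for (int i = 1; i <= 7; i++) {                   /* step 703 */
            if (x < node.key[i - 1])                     /* step 704 */
                break;
            next = node.ptr[i];                          /* step 705 */
        }
        return next;                                     /* step 707 */
    }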

Step 603 of FIG. 6 for the processing of a leaf node of the translation tree by table walker 115 is illustrated in more detail by flowchart 800 of FIG. 8. Processing begins in step 801, in which table walker 115 requests the memory cache line starting at the real address stored in (or corresponding to, in the case of a logical node address) the node address field of current_node_ptr. Memory 106 takes several cycles to set up the read and then sends, in a sequence of consecutive cycles, a fixed number (8, in the embodiment represented by the RTL code of Appendix A) of bytes of the requested cache line to table walker 115 until all 128 bytes of the cache line have been received.

As well, a counter variable, i, is set to 1 in step 801. The purpose of counter variable i is to ensure that the loop defined by steps 802, 806, 807 and 808, described further below, is executed no more than 5 times. Also in step 801 the number of valid translations present in the currently processed leaf node, as stored in the translation number field of current_node_ptr, is stored in variable "num_trans". As well, a boolean variable "found", used to indicate whether or not the desired translation has been located, is initialized to false in step 801.

Processing transfers from step 801 to decision step 802, in which counter variable i is compared to num_trans. If i is greater than num_trans, i.e. the number of valid translations expected in the current leaf node, then processing transfers from decision step 802 to decision step 803, where the value of boolean variable "found" is examined. If "found" equals false, indicating that the desired translation was not found, then processing transfers to step 809, where table walker 115 generates an interrupt indicating to the operating system that the desired translation was not found in translation table 116 and causing the operating system to take appropriate action. Otherwise, processing transfers to step 804, where processing terminates.

If i is not greater than num_trans then processing transfers from decision step 802 to decision step 806, where the valid bit contained in the next (i.e. for i=1, the first) 24 bytes (i.e. the next translation descriptor) of the requested cache line is examined by table walker 115. If the valid bit is not set (thereby indicating that the translation descriptor is not valid) processing transfers from step 806 to step 805, where table walker 115 generates an interrupt to cause the operating system to start executing and to inform the operating system that the cause of the interrupt was the fact that the number of valid translations found in the current leaf node was less than the number expected, i.e. the number stored in variable "num_trans" in step 801.

If the valid bit is set then processing transfers from step 806 to step 807, where boolean variable "found" is set to true if the input address stored in the first 8 bytes of the currently examined translation descriptor (i.e. the 24 bytes of the requested cache line referred to in step 806) is equal to X, the input address whose translation table walker 115 is seeking. From step 807 processing transfers to step 808, where table walker 115 initiates an INSERTION operation in TLB 101 in order to insert the data contained in (and/or, in some embodiments, computed from) the first 8 bytes (input address) and the last 16 bytes (output data) of the translation descriptor currently examined by table walker 115 into an entry of CAM 102 and the corresponding entry of SRAM 103, respectively. (As described above, the TLB INSERTION operation is designed to prevent the insertion of a duplicate entry.) Also, in step 808 counter variable i is incremented by 1. From step 808 processing transfers to step 802. After table walker 115 has processed a leaf node of the translation tree, the requested translation data corresponding to input address X will have been entered into TLB 101 (assuming it was found in the leaf node; if not, an interrupt is generated as described above), as well as the translation data corresponding to up to four other input addresses.
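
Flowchart 800 similarly reduces to the loop below, continuing the sketches above. raise_interrupt() and its argument names are hypothetical stand-ins for the interrupts of steps 805 and 809, and the first 8 bytes of the SRAM data portion stand in for the output data handed to the INSERTION.

    enum walk_fault { FEWER_VALID_THAN_EXPECTED, TRANSLATION_NOT_FOUND };
    void raise_interrupt(enum walk_fault fault);

    /* Process one leaf node per flowchart 800 of FIG. 8. */
    void walk_leaf(tlb_t *t, node_ptr_t cur, uint64_t x)
    {
        leaf_node_t node;
        fetch_line(cur.node_addr, cur.addr_type, &node);    /* step 801 */

        bool found = false;
        for (unsigned i = 0; i < cur.num_trans && i < 5; i++) {
            trans_desc_t *d = &node.desc[i];
            if (!d->valid) {                                /* step 806 */
                raise_interrupt(FEWER_VALID_THAN_EXPECTED); /* step 805 */
                return;
            }
            if (d->vaddr == x)                              /* step 807 */
                found = true;
            tlb_insert(t, d->vaddr, d->sram_data[0]);       /* step 808 */
        }
        if (!found)                                         /* steps 803, 809 */
            raise_interrupt(TRANSLATION_NOT_FOUND);
    }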

The extra time required to insert unrequested translations stored in the same translation tree leaf node as the requested translation is small in comparison to the total time required to locate the requested translation. Given the locality of reference exhibited by most programs (i.e. if a program references input address X it is likely to reference input addresses close to X in the near future) and the fact that a leaf node of the translation tree stores translations for closely located input addresses, insertion of unrequested translations should lower the miss rate in the TLB and thus decrease the average time required to perform an address translation. In effect, the cost of performing a search of translation table 116 may be amortized over two or more translations.

This disclosure contains material subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

Microfiche Appendix A contains AIDA source code (RTL) files which, when compiled, produce a flattened netlist file. The compilation uses a conventional technology library containing a macro definition for each macro invoked in the AIDA RTL files. Using a translation tool, the netlist file can be converted into an input file for the GARDS placement and routing tool sold by Silvar-Lisco. The output of the GARDS tool can be used to produce masks to fabricate an integrated circuit for a translation lookaside buffer and a table walker. On frame 3 of Appendix A is a functional specification document for a macro called BVLBCAM, which is the CAM portion of a translation lookaside buffer.

Printed Appendix B consists of two papers, "Architecture Overview of HaL Systems" and "Microarchitecture of HaL's Memory Management Unit", and three parts of the HaL Memory Management Compendium version 4.4 (section 2.1: "Basic Concepts"; chapter 4: "MMU Data Structures"; and chapter 10: "View Lookaside Buffer") of HaL Computer Systems, Inc., that describe various aspects of a computer system which may include the present invention.

This disclosure is illustrative and not limiting; further modifications will be apparent to one skilled in the art and are intended to fall within the scope of the appended claims.

What is claimed is:
1. An apparatus for address translation in a computer system that includes a memory, said memory storing a B-tree data structure that contains leaf nodes and index nodes, each of said leaf nodes storing at least one translation and each of said index nodes storing at least one pointer pointing to another node of said B-tree data structure, said apparatus comprising: a node pointer examiner, whereby said node pointer examiner determines whether a node of said B-tree data structure pointed to by the node address field of a node pointer is a leaf node or an index node; an index node retriever/processor operatively coupled to said memory and said node pointer examiner, whereby said index node retriever/processor retrieves from said memory and processes said node of said B-tree data structure if said node pointer examiner determines that said node of said B-tree data structure is an index node; and a leaf node retriever/processor operatively coupled to said memory and said node pointer examiner, whereby said leaf node retriever/processor retrieves from said memory and processes said node of said B-tree data structure if said node pointer examiner determines that said node of said B-tree data structure is a leaf node; wherein said leaf node retriever/processor retrieves said node and inserts each translation stored in said node of said B-tree data structure into a translation buffer only if a search of said translation buffer finds more than one valid entry, or no valid entry, in said translation buffer storing said translation.